feat: resolve ES/TS package specifiers for IMPORTS edges (#180)#184
feat: resolve ES/TS package specifiers for IMPORTS edges (#180)#184dLo999 wants to merge 3 commits intoDeusData:mainfrom
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Hey, thanks for publishing this PR. I will tackle this immediately after the 15th of April :) |
…ith upstream Conflicts resolved: - Makefile.cbm: kept pass_pkgmap.c in PIPELINE_SRCS, added upstream's pass_similarity.c + pass_semantic_edges.c, kept new SIMHASH_SRCS / SEMANTIC_SRCS / UNIXCODER_BLOB_SRC sections; dropped stale httplink.c entry (removed upstream) - pass_definitions.c: accepted upstream refactor to create_import_edges_for_file() helper; updated bare-specifier branch to use cbm_pipeline_resolve_module() - pass_parallel.c: accepted upstream refactor to create_imports_edges() helper; updated bare-specifier branch to use cbm_pipeline_resolve_module() - pass_usages.c: accepted upstream flat-loop cleanup (removed shadow variable re-declaration from PR); updated to use cbm_pipeline_resolve_module() - pipeline.c: merged both .pkg_map = pkg_map (PR) and .mode = (int)p->mode (upstream) into ctx initializer; moved pkg_map declaration before first goto cleanup to fix -Wsometimes-uninitialized
There was a problem hiding this comment.
Review Summary
Adds ES/TS package specifier resolution for monorepo IMPORTS edges (closes #180). New pass_pkgmap.c walks the repo, parses package.json files (name + entry point via exports["."]/main/fallback), and builds a CBMHashTable mapping package names to resolved module QNs. cbm_pipeline_resolve_module(ctx, path) wraps fqn_module() at 5 pass sites; falls through for relative/absolute/unknown paths. Zero overhead for non-JS repos (NULL map).
Verification status: branch merged with upstream/main cleanly; 2,740 tests pass unsandboxed (49 "failures" in an earlier sandboxed run were mkdtemp restrictions, not regressions).
Update (commit 4e183de): nit fixes applied. Fix 2 (duplicate-key comment clarification) landed. Fix 1 (NULL-arg ternary split) intentionally skipped — would have been a behavior change (callers free() the return unconditionally; cbm_pipeline_fqn_module already returns strdup("") for NULL project, so the combined guard is load-bearing). 2,740 tests still pass post-fix.
Findings
- [nit]
pass_pkgmap.c:229-241— duplicate package-name handling: comment now explicitly documents thatcbm_ht_setstores the supplied key pointer directly (slot->key = cur.key), so allocating a new key without freeing the old would orphan it. Logic unchanged; rationale now clear. ✅ addressed in4e183de. - [nit]
pass_pkgmap.ccbm_pipeline_resolve_module()NULL-arg ternary — intentionally preserved. The combinedif (!ctx || !module_path)routing throughcbm_pipeline_fqn_module(ctx ? ctx->project_name : NULL, module_path)is load-bearing because callersfree()the return unconditionally andfqn_modulehandles NULL project gracefully viastrdup(""). Splitting would be a behavior change. - [nit]
tests/test_pkgmap.c— 20 new tests cover NULL inputs, scoped/bare packages,exports["."]resolution, subpath resolution, relative/absolute fallthrough, unknown packages, missing fields. Good coverage. - [question] CI is not wired for this PR — no build/test job reports visible. Local unsandboxed
make -f Makefile.cbm testis clean, but upstream CI should confirm across the matrix.
Draft review raised two CRITICAL findings — both verified false positives
A prior draft review (based on static diff inspection) flagged:
-
"
cbm_ht_get_key()is undefined" — refuted. Declared atsrc/foundation/hash_table.h:49, defined atsrc/foundation/hash_table.c:186, used acrossregistry.c,graph_buffer.c, with dedicated tests intest_hash_table.c:187-205(including NULL safety). The draft also cited wrong line number (claimed line 304; actual call is line 239). -
"Use-after-free in duplicate key handling" — refuted by careful struct-ownership trace.
cbm_ht_get(...)returns the VALUE (cbm_pkg_entry_t*).free(prev)frees the value struct only; the key is stored separately inslot->keyand is not touched.cbm_ht_get_key()returnsslot->key, which remains valid. The subsequentcbm_ht_set(pkg_map, existing_key, entry)matches the same key viastrcmp, updatesslot->value, and reassignsslot->keyto the same pointer (hash_table.c:244). No key memory is freed or invalidated in this flow.
Both claims would have been caught by actually running the build/tests, which do in fact pass.
CI Status
No CI checks reported on feat/es-ts-package-map. Local build + test (unsandboxed): clean, 2,740/2,740 pass (both pre- and post-fixup).
Verdict
APPROVE — semantically correct, consistent integration across 5 pass sites, good test coverage, zero-overhead fallthrough for non-JS repos. Recommend upstream CI wire-up as a separate concern.
Fix 2 only: the duplicate-package branch reuses the existing strdup'd key in the hash-table slot rather than allocating a fresh one. Document why this is intentional — cbm_ht_set stores the supplied pointer directly, so allocating a new key without freeing the old one would be a leak. Fix 1 (NULL-arg ternary split) was evaluated and skipped: splitting the combined !ctx || !module_path guard would change the !module_path return value from fqn_module(project_name, NULL) (a valid strdup'd string) to NULL, which callers free() unconditionally — a regression risk.
Closes #180
Summary
Adds ES/TS package specifier resolution so that
import { foo } from '@myorg/storage-utils'produces real IMPORTS edges. Previously,fqn_module()treated npm package specifiers as file paths, producing QNs that matched no graph nodes — result: zero IMPORTS edges for all package imports.How it works
Package map build (new Phase 1.5 in
pipeline_run()): Walks the repo forpackage.jsonfiles, parsesname+ entry point (exports["."]→main→src/index.tsfallback). Builds aCBMHashTablemapping package names to resolved module QNs.cbm_pipeline_resolve_module(ctx, module_path): New wrapper called instead offqn_module()at IMPORTS-edge creation sites. If the specifier is a package reference and found in the map, returns the mapped QN. Handles subpath imports (@myorg/pkg/utils) by resolving relative to the package directory. Falls through tofqn_module()for relative paths and unknown packages.Zero behavior change for non-JS repos: When no
package.jsonfiles exist,pkg_mapis NULL and all resolution falls through tofqn_module(). 0ms overhead.Changes
src/pipeline/pass_pkgmap.c(new) — package map build, free, and resolve functionssrc/pipeline/pipeline_internal.h—pkg_mapfield on ctx,cbm_pkg_entry_tstruct, prototypessrc/pipeline/pipeline.c— Phase 1.5 build step, ctx initialization, cleanupsrc/pipeline/pass_definitions.c:297—resolve_module()instead offqn_module()src/pipeline/pass_calls.c:91— samesrc/pipeline/pass_usages.c:96— samesrc/pipeline/pass_semantic.c:81— samesrc/pipeline/pass_parallel.c:647— same (incbm_build_registry_from_cache)Makefile.cbm— addedpass_pkgmap.candtest_pkgmap.ctests/test_pkgmap.c(new) — 20 unit testsTest results
Build: Compiles cleanly on macOS (Apple Clang, arm64) with
-Wall -Wextra -WerrorTest suite: 2761 passed, 0 failed (was 2741 — 20 new pkgmap tests)
Behavioral verification (monorepo matching issue scenario):
Created pnpm-like monorepo:
MATCH (a)-[r:IMPORTS]->(b) RETURN a, bapps/server/src/__file__→packages/storage-utils/src/indexMATCH (a)-[r:CALLS]->(b) RETURN a, bpkg_mapphase timingEdge cases tested (unit tests):
./utils→ falls through@test/utils→ exact match@test/utils/helpers→ resolves relative to pkg dirreact→ falls through"name"→ skippedexports["."]["import"]conditional → resolvedpackage.jsonin repo → NULL map, 0msScope boundary (not addressed)
./utils) — already works@/components) — separate issuenode_modules/) — outside project graph by designGenerated with agent-team via /issue